contrastive search
Context-Enhanced Contrastive Search for Improved LLM Text Generation
Sen, Jaydip, Pandey, Rohit, Waghela, Hetvi
Recently, Large Language Models (LLMs) have demonstrated remarkable advancements in Natural Language Processing (NLP). However, generating high-quality text that balances coherence, diversity, and relevance remains challenging. Traditional decoding methods, such as beam search and top-k sampling, often struggle with either repetitive or incoherent outputs, particularly in tasks that require long-form text generation. To address these limitations, the paper proposes a novel enhancement of the well-known Contrastive Search algorithm, Context-Enhanced Contrastive Search (CECS) with contextual calibration. The proposed scheme introduces several novelties, including dynamic contextual importance weighting, multi-level Contrastive Search, and adaptive temperature control, to optimize the balance between fluency, creativity, and precision. The performance of CECS is evaluated using several standard metrics such as BLEU, ROUGE, and semantic similarity. Experimental results demonstrate significant improvements in both coherence and relevance of the texts generated by CECS, outperforming existing Contrastive Search techniques. The proposed algorithm has several potential real-world applications, including legal document drafting, customer service chatbots, and content marketing.

In recent years, Large Language Models (LLMs) have transformed the field of Natural Language Processing (NLP), delivering cutting-edge performance across numerous tasks, including text generation, summarization, machine translation, and question answering. Models such as OpenAI's GPT-3 [1], Google's BERT [2], and more recently PaLM [3], have greatly enhanced the capabilities of machines in understanding and generating human language. By leveraging deep neural network architectures and training on extensive datasets, LLMs have made significant strides in producing fluent and coherent text that closely resembles human communication.
Generating text from an LLM involves more than simply predicting the next word in a sequence according to its probability distribution. This step, known as decoding, plays a critical role in shaping the final output. Various decoding strategies have been proposed in the literature, ranging from deterministic methods such as beam search to stochastic methods like top-k and nucleus sampling. While the deterministic methods choose the highest-probability token at each step, their stochastic counterparts introduce randomness to improve diversity in the generated output.
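The stochastic strategies mentioned above are easy to illustrate. The sketch below, assuming a NumPy array of next-token probabilities, shows how top-k and nucleus (top-p) filtering truncate the distribution before sampling; the function names are illustrative, not from any particular library.

```python
import numpy as np

def top_k_filter(probs, k):
    """Keep only the k highest-probability tokens, then renormalize."""
    idx = np.argsort(probs)[::-1][:k]
    filtered = np.zeros_like(probs)
    filtered[idx] = probs[idx]
    return filtered / filtered.sum()

def nucleus_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches p, then renormalize (nucleus / top-p filtering)."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.1, 0.05, 0.05])
top2 = top_k_filter(probs, 2)        # only tokens 0 and 1 keep mass
nucleus = nucleus_filter(probs, 0.79)  # smallest prefix covering 0.79 mass
```

Sampling then draws from the filtered distribution instead of the raw one, which is what trades determinism for diversity.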
Decoding Decoded: Understanding Hyperparameter Effects in Open-Ended Text Generation
Arias, Esteban Garces, Li, Meimingwei, Heumann, Christian, Aßenmacher, Matthias
Decoding strategies for generative large language models (LLMs) are a critical but often underexplored aspect of text generation tasks. Guided by specific hyperparameters, these strategies aim to transform the raw probability distributions produced by language models into coherent, fluent text. In this study, we undertake a large-scale empirical assessment of a range of decoding methods, open-source LLMs, textual domains, and evaluation protocols to determine how hyperparameter choices shape the outputs. Our experiments include both factual (e.g., news) and creative (e.g., fiction) domains, and incorporate a broad suite of automatic evaluation metrics alongside human judgments. Through extensive sensitivity analyses, we distill practical recommendations for selecting and tuning hyperparameters, noting that optimal configurations vary across models and tasks. By synthesizing these insights, this study provides actionable guidance for refining decoding strategies, enabling researchers and practitioners to achieve higher-quality, more reliable, and context-appropriate text generation outcomes.
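One of the hyperparameters such a study sweeps is the sampling temperature. A minimal sketch of how temperature reshapes the model's next-token distribution (illustrative, not the paper's code):

```python
import numpy as np

def apply_temperature(logits, temperature):
    """Rescale logits before the softmax: T < 1 sharpens the distribution
    (more deterministic), T > 1 flattens it (more diverse)."""
    z = logits / temperature
    z -= z.max()              # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

Sweeping `temperature` alongside top-k/top-p cutoffs is exactly the kind of interaction a sensitivity analysis like this one has to disentangle, since the parameters jointly control how much probability mass the sampler can reach.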
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
Arias, Esteban Garces, Rodemann, Julian, Li, Meimingwei, Heumann, Christian, Aßenmacher, Matthias
Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, top-$k$ sampling, nucleus (top-$p$) sampling, typical decoding, contrastive decoding, and contrastive search, have been proposed to address this problem, aiming to improve coherence, diversity, as well as resemblance to human-generated text. In this study, we introduce adaptive contrastive search, a novel decoding strategy extending contrastive search by incorporating an adaptive degeneration penalty, guided by the estimated uncertainty of the model at each generation step. This strategy is designed to enhance both the creativity and diversity of the language modeling process while at the same time producing coherent and high-quality generated text output. Our findings indicate performance enhancement in both aspects, across different model architectures and datasets, underscoring the effectiveness of our method in text generation tasks. Our code base, datasets, and models are publicly available.
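The adaptive-penalty idea can be illustrated with a simple uncertainty-to-weight mapping. The sketch below uses normalized entropy of the next-token distribution as the uncertainty estimate; the paper's exact schedule may differ, so treat `adaptive_alpha` as a hypothetical stand-in:

```python
import numpy as np

def adaptive_alpha(probs, eps=1e-12):
    """Map per-step model uncertainty to a degeneration-penalty weight
    in [0, 1]: higher entropy (more uncertainty) -> stronger penalty.
    Normalized by log(vocab size) so a uniform distribution gives 1.0."""
    h = -np.sum(probs * np.log(probs + eps))
    return h / np.log(len(probs))
```

When the model is confident (near one-hot distribution) the weight collapses toward zero and decoding behaves almost greedily; when it is uncertain, the degeneration penalty dominates and pushes generation away from repetition.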
Duwak: Dual Watermarks in Large Language Models
Zhu, Chaoyi, Galjaard, Jeroen, Chen, Pin-Yu, Chen, Lydia Y.
As large language models (LLM) are increasingly used for text generation tasks, it is critical to audit their usages, govern their applications, and mitigate their potential harms. Existing watermark techniques are shown effective in embedding single human-imperceptible and machine-detectable patterns without significantly affecting generated text quality and semantics. However, the efficiency in detecting watermarks, i.e., the minimum number of tokens required to assert detection with significance and robustness against post-editing, is still debatable. In this paper, we propose, Duwak, to fundamentally enhance the efficiency and quality of watermarking by embedding dual secret patterns in both token probability distribution and sampling schemes. To mitigate expression degradation caused by biasing toward certain tokens, we design a contrastive search to watermark the sampling scheme, which minimizes the token repetition and enhances the diversity. We theoretically explain the interdependency of the two watermarks within Duwak. We evaluate Duwak extensively on Llama2 under various post-editing attacks, against four state-of-the-art watermarking techniques and combinations of them. Our results show that Duwak marked text achieves the highest watermarked text quality at the lowest required token count for detection, up to 70% tokens less than existing approaches, especially under post paraphrasing.
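Duwak's token-probability watermark belongs to the family of "green-list" logit-biasing schemes. The following is a generic sketch of that family, not Duwak's exact construction; the helper names, the secret key, and the bias `delta` are all illustrative:

```python
import hashlib
import numpy as np

def green_ids(prev_token, vocab_size, gamma=0.5, key=b"secret"):
    """Seed a PRNG from the previous token and a secret key, then mark a
    gamma-fraction of the vocabulary as 'green' (to be boosted)."""
    digest = hashlib.sha256(key + prev_token.to_bytes(4, "big")).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def watermark_logits(logits, prev_token, delta=2.0, gamma=0.5):
    """Add bias delta to green-list logits; a detector that knows the key
    later counts how often generated tokens landed in the green list."""
    green = green_ids(prev_token, len(logits), gamma)
    out = logits.copy()
    for i in green:
        out[i] += delta
    return out
```

Duwak's contribution, per the abstract, is to pair such a distribution-level pattern with a second watermark embedded in the sampling scheme (via a contrastive search that limits repetition), so detection needs fewer tokens.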
Fine-grained Conversational Decoding via Isotropic and Proximal Search
Yao, Yuxuan, Wu, Han, Xu, Qiling, Song, Linqi
General-purpose text decoding approaches are usually adopted for dialogue response generation. Although the quality of the generated responses can be improved with dialogue-specific encoding methods, conversational decoding methods are still under-explored. Inspired by \citet{wu2023learning} that a good dialogue feature space should follow the rules of locality and isotropy, we present a fine-grained conversational decoding method, termed \textit{isotropic and proximal search (IPS)}. Our method is designed to generate the semantic-concentrated response, while still maintaining informativeness and discrimination against the context. Experiments show that our approach outperforms existing decoding strategies in the dialogue field across both automatic and human evaluation metrics. More in-depth analyses further confirm the effectiveness of our approach.
Fidelity-Enriched Contrastive Search: Reconciling the Faithfulness-Diversity Trade-Off in Text Generation
Chen, Wei-Lin, Wu, Cheng-Kuang, Chen, Hsin-Hsi, Chen, Chung-Chi
In this paper, we address the hallucination problem commonly found in natural language generation tasks. Language models often generate fluent and convincing content but can lack consistency with the provided source, resulting in potential inaccuracies. We propose a new decoding method called Fidelity-Enriched Contrastive Search (FECS), which augments the contrastive search framework with context-aware regularization terms. FECS promotes tokens that are semantically similar to the provided source while penalizing repetitiveness in the generated text. We demonstrate its effectiveness across two tasks prone to hallucination: abstractive summarization and dialogue generation. Results show that FECS consistently enhances faithfulness across various language model sizes while maintaining output diversity comparable to well-performing decoding algorithms.
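The "context-aware regularization" described above can be sketched as a per-candidate score that rewards similarity to the source on top of the usual contrastive-search terms. This is an illustrative reading of FECS, not the authors' implementation; the weights `alpha` and `beta` are placeholders:

```python
import numpy as np

def fecs_score(prob_v, cand_rep, ctx_reps, src_reps, alpha=0.6, beta=0.3):
    """Illustrative FECS-style score for one candidate token:
    model confidence, minus a degeneration penalty against the already
    generated context, plus a fidelity reward for similarity to the source."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    penalty = max(cos(cand_rep, h) for h in ctx_reps)    # discourages repetition
    fidelity = max(cos(cand_rep, s) for s in src_reps)   # encourages faithfulness
    return (1 - alpha) * prob_v - alpha * penalty + beta * fidelity
```

The decoder would evaluate this score over the top-k candidates at each step and emit the argmax, so faithfulness is enforced token by token rather than by post-hoc filtering.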
Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization
Bonaldi, Helena, Attanasio, Giuseppe, Nozza, Debora, Guerini, Marco
Recent computational approaches for combating online hate speech involve the automatic generation of counter narratives by adapting Pretrained Transformer-based Language Models (PLMs) with human-curated data. This process, however, can produce in-domain overfitting, resulting in models generating acceptable narratives only for hatred similar to training data, with little portability to other targets or to real-world toxic language. This paper introduces novel attention regularization methodologies to improve the generalization capabilities of PLMs for counter narratives generation. Overfitting to training-specific terms is then discouraged, resulting in more diverse and richer narratives. We experiment with two attention-based regularization techniques on a benchmark English dataset. Regularized models produce better counter narratives than state-of-the-art approaches in most cases, both in terms of automatic metrics and human evaluation, especially when hateful targets are not present in the training data. This work paves the way for better and more flexible counter-speech generation models, a task for which datasets are highly challenging to produce.
Contrastive Search Is What You Need For Neural Text Generation
Generating text with autoregressive language models (LMs) is of great importance to many natural language processing (NLP) applications. Previous solutions for this task often produce text that contains degenerative expressions or lacks semantic consistency. Recently, Su et al. introduced a new decoding method, contrastive search, based on the isotropic representation space of the language model and obtained a new state of the art on various benchmarks. Additionally, Su et al. argued that the representations of autoregressive LMs (e.g., GPT-2) are intrinsically anisotropic, a view also shared by previous studies. Therefore, to ensure the language model follows an isotropic distribution, Su et al. proposed a contrastive learning scheme, SimCTG, which calibrates the language model's representations through additional training. In this study, we first answer the question: "Are autoregressive LMs really anisotropic?" To this end, we extensively evaluate the isotropy of LMs across 16 major languages. Surprisingly, we find that the anisotropic problem only exists in the two specific English GPT-2-small/medium models. On the other hand, all other evaluated LMs are naturally isotropic, which contrasts with the conclusion drawn by previous studies. Based on our findings, we further assess the contrastive search decoding method using off-the-shelf LMs on four generation tasks across 16 languages. Our experimental results demonstrate that contrastive search significantly outperforms previous decoding methods without any additional training. More notably, on 12 out of the 16 evaluated languages, contrastive search performs comparably with human-level performance as judged by human evaluations. Our code and other related resources are publicly available at https://github.com/yxuansu/Contrastive_Search_Is_What_You_Need.
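The selection rule contrastive search applies at each step can be written compactly: among the top-k candidates, pick the token that balances model confidence against maximum similarity to the context already generated. A minimal NumPy sketch, assuming pre-computed hidden representations (all names illustrative):

```python
import numpy as np

def contrastive_search_step(probs, cand_reps, ctx_reps, k=4, alpha=0.6):
    """One contrastive-search decoding step.

    probs:     model probabilities over the vocabulary at this step
    cand_reps: hidden representation each vocabulary token would have if emitted
    ctx_reps:  hidden representations of the tokens generated so far
    Score = (1 - alpha) * p(v) - alpha * max cosine-sim(h_v, context),
    maximized over the top-k candidates (the degeneration penalty)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    top = np.argsort(probs)[::-1][:k]
    scores = [(1 - alpha) * probs[v] - alpha * max(cos(cand_reps[v], h) for h in ctx_reps)
              for v in top]
    return int(top[int(np.argmax(scores))])
```

Because the penalty depends only on representations and probabilities, the method needs no additional training, which is the point the abstract makes about applying it to off-the-shelf isotropic LMs.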